Skip to content

Add ESC to abort image pull with fall-back to local#341

Draft
gtsiolis wants to merge 2 commits into
mainfrom
pro-324-esc-to-abort-image-pull-with-fall-back-to-local-b20c
Draft

Add ESC to abort image pull with fall-back to local#341
gtsiolis wants to merge 2 commits into
mainfrom
pro-324-esc-to-abort-image-pull-with-fall-back-to-local-b20c

Conversation

@gtsiolis

@gtsiolis gtsiolis commented Jun 25, 2026

Copy link
Copy Markdown
Member

Pulling a floating-tag emulator image (latest/empty) used to block the entire start with no way out — the "20-minute wait at a booth with no abort" complaint. This makes the pull interruptible and resilient, without a pre-pull prompt or a registry digest probe.

Implements PRO-324. Builds on PRO-323 (ImageExists). Idea credit: Carole.

Behavior

  • Bail out of a pull. When a newer version is downloading, a local copy already exists, and we're interactive, the spinner switches to:
    Pulling new version… (press esc to keep current version)
    
    Pressing ESC cancels only the in-flight pull and starts the emulator on the existing local image. Ctrl+C (or q) still quits the whole program.
  • Automatic offline fall-back. If a pull fails (e.g. offline) and a local copy exists, lstk falls back to it automatically — in any mode — instead of aborting the start.
  • Still fatal when it should be. A pull that fails with no local copy stays fatal (ErrorEvent + silent error + telemetry), exactly as before.

The hint appears only once Docker reports actual layer download, so it never flashes on an "up to date" / cache hit.

How it works

  • pullImages checks ImageExists for every image and delegates each pull to a new pullImage helper.
  • pullImage runs the pull under a per-pull cancel context and selects on either the pull result or a buffered skip channel.
  • The domain stays UI-free: it emits a new output.PullSkippableEvent carrying a cancel channel it owns (mirroring the existing UserInputRequestEvent / ResponseCh shape). The TUI binds ESC during the pulling phase to signal that channel; ESC is inert otherwise. Non-interactive mode never emits the event, and PlainSink has no rendering for it.
  • The affordance is gated on localExists && interactive and only armed after a Downloading progress status arrives, so the domain never blocks where a prompt couldn't be answered.

Robustness fixes (second commit)

Wiring the new select + progress-drain surfaced two latent issues, fixed here:

  • Pull-progress deadlock. DockerRuntime.PullImage returned an immediate ImagePull error before registering defer close(progress), leaving the progress channel open on that path. pullImage now waits for the progress goroutine to drain, which turned the old benign goroutine leak into a hang of the whole start flow — squarely in the offline scenario this feature targets. PullImage now closes progress unconditionally.
  • Ctrl+C conflated with a pull failure. A cancelled parent context surfaced as a pull error and, with a local image present, was swallowed by the fall-back (returning success), so the start continued on an already-cancelled context instead of aborting. pullImage now propagates context.Canceled before the fall-back.

Decisions

  • Auto-fall-back-on-error defaults on when a local image exists (the issue's open question) — a silent fall-back beats a hard abort at a booth.
  • ESC cancels the pull only, never the program.
  • No registry digest probe — we react to Docker's own download signal instead.

Tests

  • internal/container/start_test.go — ESC → skip + use local; pull error + local → auto fall-back; pull error + no local → fatal; non-interactive → never skippable; parent-context cancel → propagates (not swallowed).
  • internal/runtime/docker_test.goPullImage closes progress even when ImagePull fails immediately (timeout-guarded deadlock regression).
  • internal/ui/app_test.go — ESC signals the skip channel and shows the hint without quitting; ESC is inert with no pull in flight.
  • internal/output/plain_sink_test.goPlainSink suppresses PullSkippableEvent.

The issue called for PTY integration coverage, but the two real-world triggers are non-deterministic to stage — forcing an offline pull error means intercepting the hardcoded registry, and an ESC-during-pull PTY test depends on pull timing. The behavior is instead covered by the deterministic unit/UI tests above.

@gtsiolis gtsiolis self-assigned this Jun 25, 2026
@gtsiolis gtsiolis added semver: patch docs: needed Pull request requires documentation updates labels Jun 25, 2026
gtsiolis and others added 2 commits June 26, 2026 15:19
During a floating-tag pull, if a local copy of the image is already present
and we're interactive, show "press esc to keep current version" once real
layer download begins. Pressing ESC cancels only the in-flight pull (Ctrl+C
still quits) and starts the emulator with the existing local image. A failed
pull (e.g. offline) with a local copy present now falls back automatically
instead of aborting the start.

The domain stays UI-free: pullImage runs the pull under a per-pull cancel
context and selects on a skip channel the TUI signals via the new
PullSkippableEvent.

Generated with [Linear](https://linear.app/localstack/issue/PRO-324/esc-to-abort-image-pull-with-fall-back-to-local#agent-session-ee6a32f4)

Co-authored-by: linear-code[bot] <222613912+linear-code[bot]@users.noreply.github.com>
Two correctness fixes in the ESC-to-abort-pull flow (PRO-324):

1. Deadlock on an immediate pull error. DockerRuntime.PullImage returned
   the error from client.ImagePull before registering `defer close(progress)`,
   so the progress channel was never closed on that path. pullImage now waits
   on the progress goroutine to drain (`<-progressDone`), turning that prior
   goroutine leak into a hang of the whole start flow — exactly in the
   offline/unreachable-registry scenario this feature targets. Close progress
   unconditionally so callers can never hang. Regression test points a
   DockerRuntime at an unreachable socket and asserts progress is closed.

2. Ctrl+C during a pull was conflated with a pull failure. A cancelled parent
   context surfaced as a pull error and, when a local image existed, was
   swallowed by the local-image fall-back (returning success), so the start
   continued on an already-cancelled context instead of aborting. Propagate
   context.Canceled before the fall-back. Regression test cancels the parent
   context mid-pull and asserts the cancellation propagates.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@gtsiolis gtsiolis force-pushed the pro-324-esc-to-abort-image-pull-with-fall-back-to-local-b20c branch from c96a6a7 to 970b218 Compare June 26, 2026 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs: needed Pull request requires documentation updates semver: patch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant